Direct and Unbiased Multiple Imputation Methods for Missing Values of Categorical Variables
Abstract
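Missing data is a common problem in statistical analyses. To make use of information in data with incomplete observations, missing values can be imputed so that standard statistical methods can be used to analyze the data. Variables with missing values are often categorical, and the missing pattern may not be monotone. Currently, commonly used imputation methods for non-monotone missing patterns do not allow direct inclusion of categorical variables. Categorical variables are converted to numerical variables before imputation. For many applications, those imputed numerical values must then be converted back to categorical values. However, this conversion introduces bias, which can seriously affect subsequent analyses. In this paper, we propose two direct imputation methods for categorical variables with a non-monotone missing pattern: a direct imputation approach incorporated with the expectation-maximization algorithm, and a direct imputation approach incorporated with a new algorithm, the imputation-maximization algorithm. Simulation studies show that both direct imputation methods perform better than the method using variable conversion. An application to real data is provided to compare the direct imputation methods with the method using variable conversion.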
Similar resources
Multiple Imputation of Missing Categorical and Continuous Values via Bayesian Mixture Models with Local Dependence
We present a nonparametric Bayesian joint model for multivariate continuous and categorical variables, with the intention of developing a flexible engine for multiple imputation of missing values. The model fuses Dirichlet process mixtures of multinomial distributions for categorical variables with Dirichlet process mixtures of multivariate normal distributions for continuous variables. We inco...
A nonparametric multiple imputation approach for missing categorical data
BACKGROUND Incomplete categorical variables with more than two categories are common in public health data. However, most of the existing missing-data methods do not use the information from nonresponse (missingness) probabilities. METHODS We propose a nearest-neighbour multiple imputation approach to impute a missing at random categorical outcome and to estimate the proportion of each catego...
Missing Data and Imputation Methods in Partition of Variables
We deal with the effect of missing data, under a "Missing at Random" model, on the classification of variables with non-hierarchical methods. The partitions are compared using the Rand index.
Bayesian Multiple Imputation and Maximum Likelihood Methods for Missing Data
Bayesian multiple imputation and maximum likelihood provide useful strategies for dealing with datasets that include missing values. Imputation methods affect the significance of test results and the quality of estimates. In this paper, the general procedures of multiple imputation and maximum likelihood are described, which include the normal-based analysis of a multiply imputed dataset. A Monte Carlo si...
Multiple Imputation of Missing or Faulty Values Under Linear Constraints
Many statistical agencies, survey organizations, and research centers collect data that suffer from item nonresponse and erroneous or inconsistent values. These data may be required to satisfy linear constraints, e.g., bounds on individual variables and inequalities for ratios or sums of variables. Often these constraints are designed to identify faulty values, which then are blanked and imputed...
Journal
Journal title: Journal of Data Science
Year: 2021
ISSN: ['1680-743X', '1683-8602']
DOI: https://doi.org/10.6339/jds.201207_10(3).0007